1. Model Introduction
LTX-2 and LTX-2.3 are video generation models from Lightricks. SGLang Diffusion supports the LTX series through native one-stage and two-stage pipelines for text-to-video and image-conditioned video generation.
Use Lightricks/LTX-2 or Lightricks/LTX-2.3 as --model-path. For two-stage generation, SGLang uses the spatial upsampler and distilled LoRA components from the model snapshot by default. LTX-2.3 also supports the HQ two-stage variant.
License notice: LTX-2 and LTX-2.3 are released under the LTX-2 Community License Agreement, not Apache 2.0. The license includes commercial-use restrictions for some entities. Review the official Lightricks license before production or commercial use; SGLang support does not grant additional model usage rights.
2. SGLang-diffusion Installation
Install SGLang with diffusion dependencies:
uv pip install "sglang[diffusion]" --prerelease=allow
For platform-specific setup, see the SGLang Diffusion installation guide.
3. Model Deployment
This section provides deployment configurations optimized for different LTX pipelines and hardware targets.
3.1 Basic Configuration
The LTX series supports one-stage and two-stage pipelines. LTX-2.3 also supports the HQ two-stage pipeline. The recommended launch configuration depends on whether the target GPU can keep both two-stage DiTs resident.
Interactive Command Generator: Use the configuration selector below to generate a deployment command. The default selection targets a single NVIDIA H200 with resident two-stage mode. For multi-GPU serving, start from the 2-GPU or 4-GPU presets and only change parallelism if you need more memory headroom.
3.2 Configuration Tips
Choose the pipeline class based on the quality and latency target:
| Use case | Pipeline class | Notes |
|---|---|---|
| One-stage generation | LTX2Pipeline | Fastest LTX native path. Supports T2V and TI2V. |
| Two-stage generation | LTX2TwoStagePipeline | Uses a base stage and a refinement stage. Supported by LTX-2 and LTX-2.3. |
| Two-stage High Quality (HQ) generation | LTX2TwoStageHQPipeline | LTX-2.3 HQ path; defaults to 1920x1088 unless you override --width and --height. |
Feature compatibility:
| Pipeline class | T2V | TI2V (--image-path) | LoRA (--lora-path) | Notes |
|---|---|---|---|---|
| LTX2Pipeline | Yes | Yes | Yes | One-stage path. Cannot be combined with HQ because HQ is a separate two-stage pipeline class. |
| LTX2TwoStagePipeline | Yes | Yes | Yes | Standard two-stage path for LTX-2 and LTX-2.3. |
| LTX2TwoStageHQPipeline | Yes | Yes | Yes | High Quality two-stage path for LTX-2.3. Use this instead of LTX2Pipeline; it is not a one-stage mode flag. |
For two-stage pipelines, --ltx2-two-stage-device-mode controls transformer residency:
| Mode | When to use it |
|---|---|
| snapshot | Recommended default. Balances latency and VRAM. |
| resident | Best latency on high-VRAM GPUs because both DiTs can stay resident. |
| original | Closest to the original two-stage switching semantics. |
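The choice between resident and snapshot can be scripted. The sketch below is a hypothetical helper, not part of SGLang: `pick_device_mode` and both of its arguments are illustrative names. This guide gives no numeric VRAM threshold for resident mode, so the caller must supply one measured on their own hardware.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an SGLang command): pick a value for
# --ltx2-two-stage-device-mode from the free GPU memory.
#   $1  free GPU memory in MiB
#   $2  memory in MiB that you have measured resident mode to need
#       (assumption: you supply this number; the docs do not state one)
pick_device_mode() {
  local free_mib="$1" resident_min_mib="$2"
  if [ "$free_mib" -ge "$resident_min_mib" ]; then
    # Both DiTs are expected to fit, so keep them resident for best latency.
    echo "resident"
  else
    # Fall back to the recommended default.
    echo "snapshot"
  fi
}
```

On NVIDIA systems the first argument can come from `nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits -i 0`, and the result can be passed to `--ltx2-two-stage-device-mode`. The original mode is left out on purpose because it is meant for comparisons, not for memory tuning.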
Other deployment flags:
--lora-path: Preload a community LoRA adapter.
--lora-weight-name: Select the exact safetensors file when the LoRA repository contains multiple weight files.
For native LTX-2.3 two-stage serving without a user LoRA, resident is the fastest high-VRAM path. When you pass --lora-path, SGLang still applies the user LoRA during the two-stage switch. resident remains the right choice on H200-class GPUs, which have enough VRAM for it, but do not expect the same premerged stage-2 benefit that the no-user-LoRA path gets.
3.3 Fast multi-GPU presets
For latency-oriented LTX serving, prefer CFG parallel over sequence parallelism (SP). CFG parallel splits the guidance branches across GPUs, whereas SP/Ulysses mainly helps LTX with memory pressure and long sequences.
| Target | Recommended server flags | Notes |
|---|---|---|
| 1 high-VRAM GPU | --ltx2-two-stage-device-mode resident | Fastest two-stage setup when both DiTs fit. |
| 1 standard GPU | --ltx2-two-stage-device-mode snapshot | Lower VRAM than resident; use this when H100-class memory is tight. |
| 2 GPUs | --num-gpus 2 --enable-cfg-parallel --ltx2-two-stage-device-mode resident | Fastest common 2-GPU setup. |
| 4 GPUs | --num-gpus 4 --tp-size 2 --enable-cfg-parallel --ltx2-two-stage-device-mode resident | Fastest common 4-GPU layout: TP2 inside each CFG branch. |
| Official comparison | --ltx2-two-stage-device-mode original | Use this only when matching the original stage-switch semantics matters. |
Use --enable-cfg-parallel for degree-2 CFG parallel. Use --cfg-parallel-size only when you explicitly need a different CFG branch count. If resident exceeds available VRAM, keep the same parallelism preset and switch only the device mode to snapshot.
On high-VRAM GPUs, add --text-encoder-cpu-offload false if text encoding latency matters and you have enough memory.
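The presets in the table above can be captured in a small wrapper so that only the device mode changes when resident does not fit. This is a sketch under stated assumptions: `ltx_preset_flags` is a hypothetical helper name, and every flag it prints is copied from the table, not derived independently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an SGLang command): print the server flags for a
# GPU-count preset from the table above.
#   $1  number of GPUs (1, 2, or 4)
#   $2  device mode; defaults to resident, pass snapshot if resident
#       exceeds available VRAM
ltx_preset_flags() {
  local num_gpus="$1" mode="${2:-resident}"
  case "$num_gpus" in
    1) echo "--ltx2-two-stage-device-mode $mode" ;;
    2) echo "--num-gpus 2 --enable-cfg-parallel --ltx2-two-stage-device-mode $mode" ;;
    4) echo "--num-gpus 4 --tp-size 2 --enable-cfg-parallel --ltx2-two-stage-device-mode $mode" ;;
    *)
      # Only the layouts listed in the table are covered.
      echo "no preset for $num_gpus GPUs" >&2
      return 1
      ;;
  esac
}
```

For example, `sglang serve --model-path Lightricks/LTX-2.3 --pipeline-class-name LTX2TwoStagePipeline $(ltx_preset_flags 2 snapshot)` keeps the 2-GPU parallelism preset and swaps only the device mode.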
3.3.1 Two GPUs
sglang serve \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--num-gpus 2 \
--enable-cfg-parallel \
--ltx2-two-stage-device-mode resident
3.3.2 Four GPUs
sglang serve \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--num-gpus 4 \
--tp-size 2 \
--enable-cfg-parallel \
--ltx2-two-stage-device-mode resident
4. Model Invocation
4.1 Basic Usage
The examples below rely on the current SGLang sampling defaults, which are listed here for reproducibility:
| Model path | Default output | Default frames | Default steps |
|---|---|---|---|
| Lightricks/LTX-2 | 768x512 | 121 | 40 |
| Lightricks/LTX-2.3 | 768x512 | 121 | 30 |
| Lightricks/LTX-2.3 with LTX2TwoStageHQPipeline | 1920x1088 | 121 | 15 |
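To pin these defaults explicitly instead of relying on them, the table can be turned into flags. The sketch below is illustrative only: `ltx_default_flags` is a hypothetical helper, `--width` and `--height` are the flags named in this guide, and `--num-frames` and `--num-inference-steps` are assumed to be the matching `sglang generate` sampling flags. Check `sglang generate --help` before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an SGLang command): print explicit sampling flags
# that mirror the defaults table above.
#   $1  model path
#   $2  pipeline class; defaults to LTX2Pipeline
# Assumption: --num-frames and --num-inference-steps are the CLI names of
# the frame-count and step-count sampling parameters.
ltx_default_flags() {
  local model="$1" pipeline="${2:-LTX2Pipeline}"
  if [ "$pipeline" = "LTX2TwoStageHQPipeline" ]; then
    if [ "$model" != "Lightricks/LTX-2.3" ]; then
      echo "HQ pipeline is documented for Lightricks/LTX-2.3 only" >&2
      return 1
    fi
    echo "--width 1920 --height 1088 --num-frames 121 --num-inference-steps 15"
    return 0
  fi
  case "$model" in
    Lightricks/LTX-2)
      echo "--width 768 --height 512 --num-frames 121 --num-inference-steps 40" ;;
    Lightricks/LTX-2.3)
      echo "--width 768 --height 512 --num-frames 121 --num-inference-steps 30" ;;
    *)
      echo "no documented defaults for $model" >&2
      return 1
      ;;
  esac
}
```

The output can be spliced into any of the commands in this section, for example `sglang generate --model-path Lightricks/LTX-2 --pipeline-class-name LTX2Pipeline $(ltx_default_flags Lightricks/LTX-2) --prompt "..." --save-output`.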
4.1.1 LTX-2 one-stage text-to-video
sglang generate \
--model-path Lightricks/LTX-2 \
--pipeline-class-name LTX2Pipeline \
--prompt "A quiet coastal town at sunrise, fishing boats moving slowly through golden mist, cinematic camera movement" \
--save-output
4.1.2 LTX-2.3 one-stage text-to-video
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2Pipeline \
--prompt "A quiet coastal town at sunrise, fishing boats moving slowly through golden mist, cinematic camera movement" \
--save-output
4.1.3 LTX-2 two-stage text-to-video
sglang generate \
--model-path Lightricks/LTX-2 \
--pipeline-class-name LTX2TwoStagePipeline \
--prompt "A handheld shot follows a red tram crossing a rainy city square at night, reflections on the pavement, cinematic lighting" \
--save-output
4.1.4 LTX-2.3 two-stage text-to-video
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--prompt "A handheld shot follows a red tram crossing a rainy city square at night, reflections on the pavement, cinematic lighting" \
--save-output
4.1.5 LTX-2.3 HQ text-to-video
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStageHQPipeline \
--prompt "A wide cinematic shot of alpine clouds rolling over a mountain ridge, soft morning light, slow aerial camera movement" \
--save-output
4.1.6 Image-to-video with one reference image
Pass one image to --image-path for image-conditioned generation:
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--image-path ./inputs/start.png \
--prompt "The camera slowly pushes forward as the subject turns toward warm window light, subtle natural motion, cinematic" \
--save-output
4.1.7 First-to-last-frame transition with two reference images
Pass two images to --image-path for transition-style TI2V. The first image is used as the starting condition and the second image is used as the ending condition.
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--image-path ./inputs/start.png ./inputs/end.png \
--prompt "A smooth cinematic transition from the first scene into the final scene, dynamic camera motion, motion blur, zhuanchang" \
--save-output
4.2 Advanced Usage
Use --lora-path to load a LoRA adapter. If the Hugging Face repo contains multiple safetensors files, use --lora-weight-name to select the exact file. --lora-scale maps to the standard LoRA merge scale and defaults to 1.0.
The following example uses valiantcat/LTX-2.3-Transition-LORA:
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--lora-path valiantcat/LTX-2.3-Transition-LORA \
--lora-weight-name ltx2.3-transition.safetensors \
--prompt "A low-angle tracking shot moves through a foggy forest road. The camera rises above the treetops and transitions into a clear view of a snowy mountain peak under bright sunlight, zhuanchang" \
--save-output
You can combine the Transition LoRA with two reference images:
sglang generate \
--model-path Lightricks/LTX-2.3 \
--pipeline-class-name LTX2TwoStagePipeline \
--image-path ./inputs/start.png ./inputs/end.png \
--lora-path valiantcat/LTX-2.3-Transition-LORA \
--lora-weight-name ltx2.3-transition.safetensors \
--prompt "A fast cinematic transition from the first image to the second image, whip-pan motion, atmospheric lighting, zhuanchang" \
--save-output
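The LoRA flags described at the start of this section can be assembled conditionally, which helps when a script handles both single-file and multi-file adapter repositories. This is a minimal sketch: `ltx_lora_flags` is a hypothetical helper name, and it only concatenates the three flags documented above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an SGLang command): build the LoRA flags.
#   $1  LoRA path or Hugging Face repo id
#   $2  optional weight file name, for repos with several safetensors files
#   $3  optional merge scale; when omitted, SGLang's default of 1.0 applies
ltx_lora_flags() {
  local repo="$1" weight_name="${2:-}" scale="${3:-}"
  local flags="--lora-path $repo"
  if [ -n "$weight_name" ]; then
    flags="$flags --lora-weight-name $weight_name"
  fi
  if [ -n "$scale" ]; then
    flags="$flags --lora-scale $scale"
  fi
  echo "$flags"
}
```

Calling `ltx_lora_flags valiantcat/LTX-2.3-Transition-LORA ltx2.3-transition.safetensors` reproduces the LoRA flags used in the examples above; adding a third argument such as `0.8` appends an explicit `--lora-scale`.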
Some community LoRAs only include weights for transformer blocks. In that case, SGLang logs a concise coverage summary and leaves unmatched LoRA-capable layers on the base model weights. This is expected when the adapter format intentionally omits those layers.
5. Practical Tips
- Use --pipeline-class-name LTX2TwoStagePipeline as the default LTX two-stage quality path.
- Use --pipeline-class-name LTX2TwoStageHQPipeline when you want the HQ path and have enough VRAM for larger outputs.
- Use --ltx2-two-stage-device-mode resident on high-VRAM GPUs if latency matters more than memory usage.
- Use --ltx2-two-stage-device-mode original when comparing against official two-stage behavior.
- Keep --width and --height aligned with the target model resolution; for LTX models, these are output video dimensions.